Estimating the distribution of dynamic invariants: illustrated with an application to 

human photo-plethysmographic time series 
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Dynamic invariants are often estimated from experimental time series with the aim of differen- 
tiating between different physical states in the underlying system. The most popular schemes for 
estimating dynamic invariants are capable of estimating confidence intervals, however such confi- 
dence intervals do not reflect variability in the underlying dynamics. In this communication we 
propose a surrogate based method to estimate the expected distribution of values under the null hy- 
pothesis that the underlying deterministic dynamics are stationary. We demonstrate the application 
of this method by considering four recordings of human pulse waveforms in differing physiological 
states and provide conclusive evidence that correlation dimension is capable of differentiating be- 
tween three (but not all four) of these states. 
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Various dynamic invariants are often estimated from 
time series in a wide variety of scientific disciplines. It 
has long been known that these estimates (and in partic- 
ular correlation dimension estimates) alone are not suf- 
ficient to differentiate between chaos and noise. Most 
notably, the method of surrogate data {Ij was introduced 
in an attempt to reduce the rate of false positives dur- 
ing the hunt for physical examples of chaotic dynamics. 
Although it is not possible to find conclusive evidence 
of chaos through estimation of dynamic invariants, sur- 
rogate methods are used to generate a distribution of 
statistic (i.e. the estimates of the dynamic invariant) 
values under the hypothesis of linear noise. In the most 
general form, the standard surrogate methods can gen- 
erate the distribution of statistic values under the null 
hypothesis of a static monotonic nonlinear transforma- 
tion of linearly filtered noise. 

In this communication, we introduce a significant gen- 
eralisation of a recent surrogate generation algorithm 
0, O- The pseudo-periodic surrogate (PPS) algorithm 
allows one to generate data consistent with the null hy- 
pothesis of a noise driven periodic orbit — provided the 
data exhibits pseudo-periodic dynamics. This algorithm 
has been applied to differentiate between a noisy limit 
cycle, and deterministic chaos. By modifying this algo- 
rithm and applying it to noisy time series data, we are 
able to generate surrogate time series that are indepen- 
dent trajectories of the same deterministic system. 

This ensemble of attractor trajectory surrogates (ATS) 
can then be used to estimate the distribution statistic 
values for estimates of any statistic derived from these 
time series. The statistics of greatest interest to us are 
dynamic invariants of the underlying attractor, and in 
particular correlation dimension and entropy estimates 
provided by the Gaussian kernel algorithm (GKA) 
Our choice of the GKA is entirely arbitrary, but based 
on our familiarity with this particular algorithm. 

An important application for the ATS technique is to 
determine whether dynamic invariants estimated from 



distinct time series are significantly different. The ques- 
tion this technique can address is whether (for exam- 
ple) a correlation dimension of 2.3 measured during nor- 
mal electrocardiogram activity is really distinct from the 
correlation dimension of 2.4 measured during an episode 
of ventricular tachycardia ^6^ J7j . Estimates of dynamic 
invariants (including the GKA 0, Q) often come with 
confidence intervals. But these confidence intervals can 
only be based on uncertainty in the least-mean-square 
fit, not the underlying dynamics. Conversely, it is stan- 
dard practice to obtain a large number of representative 
time series for each (supposedly distinct) physical state, 
and compare the distribution of statistic values derived 
from these. But, this approach is not always feasible: in 
1^ 0] for example, the problem is not merely that these 
physiological states are both difficult and dangerous to 
replicate, but that inter-patient variability makes doing 
so infeasible. 

In the remainder of this communication we describe 
the new ATS algorithm and demonstrate that it can be 
used to estimate the distribution of dynamic invariant 
estimates from a single time series of a known dynamical 
system (the chaotic Rossler system). We then apply this 
same method to four recordings of human pulse wave- 
forms, measured via photo-plethysmography Each 
of the four recordings correspond to a distinct physiologi- 
cal state. We compute correlation dimension and entropy 
using the GKA method and show that the expected dis- 
tribution of correlation dimension and entropy estimates 
are sufficient to differentiate between these four physio- 
logical states. 

The ATS algorithm may be described as follows. Em- 
bed a scalar time series {xt} to obtain a vector timeseries 
{zt} (of length N). The choice of embedding is arbitrary, 
but has been adequately discussed in the literature ([l^ 
for example). From the embedded time series, the surro- 
gate is obtained as follows. Choose an initial condition, 
wi £ {zt\t = 1, ■ ■ • , N}. Then, at each step, choose the 
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where the noise radius p is an as-yet unspecified con- 
stant. In other words, the successor to Wt is the succes- 
sor of a randomly chosen neighbour of Wt ■ Finally, from 
the vector time series {wt} the ATS {st} is obtained by 
projecting Wt onto [1 • • • 0] (the first coordinate). 
Hence 



St 



Wt 



[100 



01 



(2) 



In Hi this algorithm was shown to be capable of 
differentiating between deterministic chaos and a noisy 
periodic orbit. In the context of the current communi- 
cation we assume that {xt} is contaminated by additive 
(but possibly dynamic) noise and we choose the noise 
radius p such that the observed noise is replaced by an 
independent realisation of the same noise process. Fur- 
thermore, we assume that the deterministic dynamics are 
preserved by suitable choice of embedding parameters. 
Under these two assumptions, {zt} and {wt} have the 
same invariant density and {cct} and {s(} are therefore 
(noisy) realisation of the same dynamical system with 
(for suitable choice of p) the same noise distribution. 

As in 0,13 the problem remains the correct choice of p. 
This is the major difference between the ATS described 
here and the PPS of However, since the null hy- 

pothesis we wish to address is different from (and more 
general than) that of the PPS, choice of p for the ATS 
is less restrictive. For t — T given, one can compute 
P(wt+i ^ Zi+i A II Wt — ^ill = 0\t = T) directly from the 
data by applying Assuming the process is ergodic 
|ll| one can then sum 
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to get the probability of a temporal discontinuity [l3| 
in the surrogate at any time instant. There is a 1:1 
correspondence between a value p = P{wt+i ^ Zi+i A 
||wt — Zill =0) and p, and we choose to implement ^ 
for a particular value of p (i.e. a particular transition 
probability) rather than a specific noise level. In what 
follows we find that studying intermediate values of p 
{p G [0.05,0.95]) is sufficient. However, the significant 
point is that p G [0.05, 0.95] corresponds to a very nar- 
row range of values of p. 

We now demonstrate the applicability of this method 
for noisy time series data simulated from the Rossler dif- 
ferential equations (during "broad-band" chaos). We in- 
tegrated (1000 points with a time step of 0.2) the Rossler 
equations both with and without multidimensional dy- 
namic noise at 5% of the standard deviation of the data. 
We then studied the x-component after the addition of 
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FIG. 1: Distribution of statistics D, K and S for short 
and noisy realisations of the Rossler system. The his- 
togram shows the distribution of statistic estimates (D, K and 
S) for 500 ATS time series generated from a 1000 point real- 
isation of the Rossler system. The solid vertical line on each 
plot is the comparable value for the data and the stars marked 
on the horizontal axes are for 20 independent realisations of 
the same process. The top row of figures depicts results for 
the Rossler system with observational noise only, the bottom 
row of figures has both observational and dynamic noise. Pan- 
els (a) and (d) show correlation dimension estimates, (b) and 
(e) are entropy, and (c) and (f) are noise level. 



5% observational noise. We selected embedding param- 
eters using the standard methods (de = 3 and r = 8) 
and then compute ATS surrogates for various exchange 
probabilities p = 0.05, 0.1, 0.15, 0.95. For the data 
set and each ensemble of surrogates we then estimated 
correlation dimension Z), entropy K and noise level S 
using the GKA algorithm 0, H (GKA embedding usmg 
embedding dimension m = 2, 3, . . . , 10 and embedding 
lag of 1). Figure n depicts the results when the GKA is 
applied with embedding dimension m — A and the ex- 
change probability is p — 0.35. Other values of m gave 
equivalent results, as did various values of p in the range 
[0.2,0.8]. 

For p e [0.2,0.8] we found that the estimate of noise 
S from the GKA algorithm coincided for data and surro- 
gates, but this was often not the case for extreme values 
of p. Therefore, this estimate of signal noise content is 
a good indicator of the accuracy of the dynamics repro- 
duced by the ATS time series. Furthermore to confirm 
the spread of the data we also estimated D, K, and S for 
20 further realisations of the same Rossler system (with 
different initial conditions). In each expected, 
the range of these values lies well within the range pre- 
dicted by the ATS scheme. 

We now consider the application of this method to 
photo-plethysmographic recordings of human pulse dy- 
namics over a short time period (about 16.3 seconds). 
We have access to only a limited amount of data rep- 
resentative of each of four different dynamic regimes. In 
any case, we would expect the system dynamics to change 
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FIG. 2: Human pulse waveform recorded with photo- 
plethysmography. Four recordings of iiuman pulse wave- 
form (61 Hz) in four different piiysiological conditions. The 
four time series correspond to: (a) normal, (b) quasi-stable, 
(c) unstable, and (d) post-operative (stable). 



if measured over a significantly longer time frame. The 
data collection and processing with the methods of non- 
linear time series analysis are described in . Previ- 
ously, we have studied nonlinear determinism in cardiac 
dynamics measured with electrocardiogram (ECG) 
Although we do not consider ECG data here, this data 
would be another useful system to examine with these 
methods The four data sets we examine in this 

communication are depicted in figure |21 

For each data set we repeated the analysis described 
for the Rossler time series. Results for GKA embedding 
dimension m = 4 and p = 0.35 are depicted in figure 
|31 As with the Rossler system, variation of the param- 
eters m and p did not significantly change the results. 
We find that in every case (except for p ^ [0.2, 0.8]) the 
distribution of D, K and S estimated from the ATS data 
using the GKA included the true value. Most signif- 
icantly, this indicates that the range of values of p is 
appropriate. Moreover, these results are consistent with 
the hypotheses that the noise is effectively additive and 
can be modelled with this simple scheme, and that the 
underlying deterministic dynamics can be approximated 
with a local constant modelling scheme. 

We also estimated the statistics D, K and S for addi- 
tional available data (subsequent, contiguous, but non- 
overlapping) from each of the four rhythms. This small 
amount of data afforded us two or three additional esti- 
mates of each statistic for each rhythm. For the unstable 
and quasi-stable rhythm we observed good agreement. 
For the stable (normal and post-operative) rhythms, this 
is not the case. On examination of the data we find 
that this result is to be expected. Both the stable 
rhythms undergo a change in amplitude and baseline sub- 
sequent to the end of the original 16 second recording, 
this non-stationarity is reflected in the results. This same 
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FIG. 3: Distribution of statistics D, K and S for human 
pulse waveforms. The histogram shows the distribution of 
statistic estimates {D, K and S) for 500 ATS time series 
generated from each of the four time series depicted in figure 
1^ The solid vertical line on each plot is the comparable value 
for the data and the stars marked on the horizontal axes are 
for the (limited) subsequent data recorded from each patient. 
In each case only two or three subsequent contiguous but non- 
overlapping timeseries were available. The figures are: (a) 
correlation dimension (D), (b) entropy (K), and (c) noise 
(S) for the normal rhythm; (d) D, (e) K, and (f) S for the 
quasi-stable rhythm; (g) D, (h) K, and (i) S for the unstable 
rhythm; and (j) D, (k) K, and (1) S for the post-operative 
stable rhythm. 



non-stationarity has also been observed independently in 
Bhattacharya and co-workers . 

We now return to the question that the ATS test was 
designed to address: can we differentiate between these 
four rhythms based on the GKA? Figure 0] provides the 
answer. In figure 0] we see the estimated distribution of 
statistic values {D, K and S) for each of the four rhythms 
shown in figure|2| Clearly (and not surprisingly) , the cor- 
relation dimension estimate and noise level of the unsta- 
ble rhythm is significantly different from the other three 
rhythms. More significantly, the quasi-stable rhythm is 
also observed to be distinct from the other three regimes. 
Furthermore, we observe that in the quasi-stable state 
the correlation dimension estimate is significantly less 
than one, while for the unstable state it is significantly 
larger than one. For example, the quasi-stable state may 
be characterised by a noise driven stable focus 0| , while 
the unstable state exhibits high dimensional (i.e. D > 1) 
deterministic dynamics. Both these regimes exhibit sig- 
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FIG. 4: Discriminating power of the statistics D, K 
and S for human pulse waveforms. The distribution (a 
binned histogram) of statistic values estimated via the ATS 
method (as described in figure O for each of the four distinct 
physiological waveforms is shown. The four rhythms are la- 
belled 'a', 'b', 'c', and 'd' corresponding to the same labelling 
in figure |5| These figures show that correlation dimension 
alone is sufficient to differentiate between three of these four 
physiological states. The exception is that these three statis- 
tics are insufficient to differentiate between the normal and 
post-operative states. 



nificantly more (additive Gaussian) noise than the stable 
states. 

The two stable states (panels (a) and (d) of figure [^J 
are harder to distinguish: both visually and using the 
statistics D, K, and S. While the individual data sets 
we depict in figure |21 exhibit difi'erent statistic values (for 
example D = 1.06 and D = 1.01), we find that the 
ATS analysis indicates that these statistics are not sig- 
nificantly different. Both regimes exhibit a correlation 
dimension of about one, and a similar noise level. The 
variation in observed results is lesser in the post-operative 
stable regime, but the distribution do overlap. 

Finally, we find that entropy estimated with the GKA 
algorithm K is of no use in differentiating between these 
four rhythms. 

The results of this analysis are in general agreement 
with those presented in [H, 0] . Independent linear sur- 
rogate analysis H has confirmed that each of these 



four rhythms is inconsistent with a monotonic nonlinear 
transformation of linearly filtered noise j^^. The only 
significant difference is that the correlation dimension es- 
timates we present here are significantly lower than those 
in 8, 9J. This is due to the different correlation dimen- 
sion algorithm. Unlike the algorithm employed in [^|^, 
the GKA seperates the data into purely deterministic and 
stochastic components, and hence estimates both D and 
S. The correlation dimension estimated in |^ 0| is the 
combined effect of both components of the GKA. 

Although we have considered the specific application of 
human pulse dynamics, the algorithm we have proposed 
may be applied to a wide variety of problems. We have 
shown that provided time delay embedding parameters 
can be estimated adequately, and an appropriate value 
of the exchange probability is chosen, the ATS algorithm 
generates independent trajectories from the same dynam- 
ical system. When applied to data from the Rossler sys- 
tem we confirm this result, and we demonstrate its ap- 
plication to experimental data. 

When the ATS algorithm is applied to generate inde- 
pendent realisation for a hypothesis test, one is able to 
construct a test for non-stationarity. If two data sets do 
not fit the same distribution of ATS data then they can 
not be said to be from the same deterministic dynamical 
system. Unfortunately, the converse is not always true 
and the power of the test depends on the choice of statis- 
tic. The utility of this technique as a test for stationarity 
remains a subject for future investigation. 
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